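Existing methods for 3D garment reconstruction either assume a predefined template for the garment geometry (restricting them to fixed clothing styles) or produce vertex-colored meshes (which lack high-frequency texture detail). Our novel framework jointly learns geometric and semantic information from an input monocular image for template-free, textured 3D garment digitization. More specifically, we propose extending the peeled representation to predict pixel-aligned, layered depth and semantic maps from which 3D garments are extracted. The layered representation is further exploited to UV-parameterize the arbitrary surface of the extracted garment, without any human intervention, to form a UV atlas. Texture is then imparted to the UV atlas in a hybrid fashion: pixels from the input image are first projected to UV space for the visible regions, and the occluded regions are then inpainted. As a result, we are able to digitize arbitrarily loose clothing styles while retaining high-frequency texture detail from a monocular image. We achieve high-fidelity 3D garment reconstruction results on three publicly available datasets and show generalization to internet images.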
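In this paper, we develop a robust 3D garment digitization solution that generalizes well to real-world fashion catalog images with cloth texture occlusions and large body pose variations. We assume fixed-topology parametric template mesh models for known garment types (e.g., T-shirts, trousers) and map high-quality texture from an input catalog image onto the UV map panels corresponding to the garment's parametric mesh model. We achieve this by first predicting a sparse set of 2D landmarks on the garment boundary. We then use these landmarks to perform thin-plate-spline (TPS) based texture transfer onto the UV map panels. Next, we use a deep texture inpainting network to fill the large holes in the TPS output (caused by view variations and self-occlusions) to produce consistent UV maps. Furthermore, to train the supervised landmark prediction and texture inpainting tasks, we generate a large set of synthetic data with varying textures and lighting, imaged from various views and in a wide variety of poses. In addition, we manually annotate a small set of fashion catalog images from online fashion e-commerce platforms for fine-tuning. We conduct a thorough empirical evaluation and show impressive qualitative results of our proposed 3D garment texturing solution on fashion catalog images. Such 3D garment digitization helps us address the challenging task of enabling 3D virtual try-on.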
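3D human body reconstruction from monocular images is an interesting and ill-posed problem in computer vision with wide applications in multiple domains. In this paper, we propose a novel end-to-end trainable network that accurately recovers the detailed geometry and appearance of 3D people from a monocular image. We propose a sparse and efficient fusion of a parametric body prior with a non-parametric peeled depth map representation of clothed models. The parametric body prior constrains our model in two ways: first, the network retains geometrically consistent body parts that are not occluded by clothing, and second, it provides body shape context that improves the prediction of the peeled depth maps. This enables the recovery of fine-grained 3D geometric detail using only L1 losses on the 2D maps, given an input image. We evaluate our method, SHARP, on the publicly available Cloth3D and THuman datasets and report superior performance compared with state-of-the-art methods.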
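To make the phrase "L1 losses on the 2D maps" concrete, the following is a minimal sketch of a mean absolute error computed over a stack of layered 2D maps. The function name `l1_loss`, the nested-list layout (layer, row, column), and the toy values are illustrative assumptions; the abstract does not describe the actual loss implementation or its weighting.

```python
from typing import List

# A stack of 2D maps: maps[layer][row][col]. This layout is an assumption
# made for illustration, not the representation used in the paper.
MapStack = List[List[List[float]]]


def l1_loss(pred: MapStack, target: MapStack) -> float:
    """Mean absolute difference over all layers, rows, and columns."""
    total = 0.0
    count = 0
    for pred_layer, target_layer in zip(pred, target):
        for pred_row, target_row in zip(pred_layer, target_layer):
            for p, t in zip(pred_row, target_row):
                total += abs(p - t)
                count += 1
    if count == 0:
        raise ValueError("empty map stack")
    return total / count


# Toy example: two layers, each a 1x2 map (hypothetical depth values).
predicted = [[[1.0, 2.0]], [[3.0, 5.0]]]
ground_truth = [[[1.0, 2.5]], [[2.0, 5.0]]]
loss = l1_loss(predicted, ground_truth)  # (0 + 0.5 + 1 + 0) / 4
```

Because each layer is an ordinary 2D map, the loss is the same per-pixel comparison used for images, applied once per layer.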
Psychology research has long explored aspects of human personality such as extroversion, agreeableness and emotional stability. Categorizations like the 'Big Five' personality traits are commonly used to assess and diagnose personality types. In this work, we explore the question of whether the perceived personality in language models is exhibited consistently in their language generation. For example, is a language model such as GPT2 likely to respond in a consistent way if asked to go out to a party? We also investigate whether such personality traits can be controlled. We show that when provided different types of contexts (such as personality descriptions, or answers to diagnostic questions about personality traits), language models such as BERT and GPT2 can consistently identify and reflect personality markers in those contexts. This behavior illustrates an ability to be manipulated in a highly predictable way, and frames these models as tools for identifying personality traits and controlling personas in applications such as dialog systems. We also contribute a crowd-sourced dataset of personality descriptions of human subjects paired with their 'Big Five' personality assessment data, and a dataset of personality descriptions collated from Reddit.
Many real-world applications of language models (LMs), such as code autocomplete and writing assistance, involve human-LM interaction, but the main LM benchmarks are non-interactive: a system produces output without human intervention. To evaluate human-LM interaction, we develop a framework, Human-AI Language-based Interaction Evaluation (H-LINE), that expands non-interactive evaluation along three dimensions, capturing (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality. We then design five tasks ranging from goal-oriented to open-ended to capture different forms of interaction. On four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21's J1-Jumbo), we find that better non-interactive performance does not always result in better human-LM interaction and that first-person and third-party metrics can diverge, suggesting the importance of examining the nuances of human-LM interaction.
Bike sharing systems often suffer from poor capacity management as a result of variable demand. These systems would benefit from models that predict demand in order to moderate the number of bikes stored at each station. In this paper, we apply a graph neural network model to predict bike demand on the New York City Citi Bike dataset.
A hallmark of human intelligence is the ability to learn new concepts purely from language. Several recent approaches have explored training machine learning models via natural language supervision. However, these approaches fall short in leveraging linguistic quantifiers (such as 'always' or 'rarely') and mimicking humans in compositionally learning complex tasks. Here, we present LaSQuE, a method that can learn zero-shot classifiers from language explanations by using three new strategies - (1) modeling the semantics of linguistic quantifiers in explanations (including exploiting ordinal strength relationships, such as 'always' > 'likely'), (2) aggregating information from multiple explanations using an attention-based mechanism, and (3) model training via curriculum learning. With these strategies, LaSQuE outperforms prior work, showing an absolute gain of up to 7% in generalizing to unseen real-world classification tasks.
Large Language Models (LLMs) have been the subject of active research, significantly advancing the field of Natural Language Processing (NLP). From BERT to BLOOM, LLMs have surpassed state-of-the-art results in various natural language tasks such as question answering, summarization, and text generation. Many ongoing efforts focus on understanding LLMs' capabilities, including their knowledge of the world, syntax, and semantics. However, extending the textual prowess of LLMs to symbolic reasoning has been slow and predominantly focused on tackling problems related to the mathematical field. In this paper, we explore the use of LLMs for automated planning, a branch of AI concerned with the realization of action sequences (plans) to achieve a goal, typically executed by intelligent agents, autonomous robots, and unmanned vehicles. We introduce Plansformer, an LLM fine-tuned on planning problems and capable of generating plans with favorable behavior in terms of correctness and length with reduced knowledge-engineering efforts. We also demonstrate the adaptability of Plansformer in solving different planning domains with varying complexities, owing to the transfer learning abilities of LLMs. For one configuration of Plansformer on Towers of Hanoi, a puzzle-solving domain, ~97% of the generated plans are valid, of which ~95% are optimal.
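As an illustration of what "valid" and "optimal" can mean for a Towers of Hanoi plan, the sketch below simulates a plan move by move and compares its length with the known minimum of 2^n - 1 moves for the standard three-peg puzzle. The move encoding and the function names `is_valid_plan`, `is_optimal_plan`, and `solve` are assumptions made for this example; the abstract does not describe the evaluation procedure actually used for Plansformer.

```python
from typing import List, Tuple

Move = Tuple[int, int]  # (source peg, destination peg), pegs numbered 0..2


def is_valid_plan(n_disks: int, plan: List[Move], goal_peg: int = 2) -> bool:
    """Simulate the plan: every move must be legal and all disks must end on goal_peg."""
    # Peg 0 starts with all disks, the largest (n_disks) at the bottom.
    pegs = [list(range(n_disks, 0, -1)), [], []]
    for src, dst in plan:
        if src == dst or not (0 <= src < 3 and 0 <= dst < 3):
            return False
        if not pegs[src]:
            return False  # no disk to move
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # a larger disk may not sit on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[goal_peg] == list(range(n_disks, 0, -1))


def is_optimal_plan(n_disks: int, plan: List[Move], goal_peg: int = 2) -> bool:
    """A valid three-peg plan is optimal when it uses exactly 2**n - 1 moves."""
    return is_valid_plan(n_disks, plan, goal_peg) and len(plan) == 2 ** n_disks - 1


def solve(n: int, src: int = 0, dst: int = 2, aux: int = 1) -> List[Move]:
    """Classic recursive solution, used here only to produce a reference plan."""
    if n == 0:
        return []
    return solve(n - 1, src, aux, dst) + [(src, dst)] + solve(n - 1, aux, dst, src)


reference = solve(3)
padded = reference + [(2, 1), (1, 2)]  # still reaches the goal, but wastes two moves
```

Here `reference` passes both checks, while `padded` is valid but not optimal, which is the distinction the two reported percentages draw.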
Chatbots, or bots for short, are multi-modal collaborative assistants that can help people complete useful tasks. When chatbots are mentioned in connection with elections, they often draw negative reactions due to fears of misinformation and hacking. In this paper, we instead explore how chatbots may be used to promote voter participation in vulnerable segments of society like senior citizens and first-time voters. In particular, we build a system that amplifies official information while transparently personalizing it to users' unique needs. We discuss its design, build prototypes with frequently asked questions (FAQ) election information for two US states that are low on an ease-of-voting scale, and report on its initial evaluation in a focus group. Our approach can be a win-win for voters, for election agencies trying to fulfill their mandate, and for democracy at large.
This paper presents a new approach for analyzing and identifying potentially useful generalized plans. It presents a new conceptual framework along with an algorithmic process for assessing termination and reachability related properties of generalized plans. The presented framework builds upon classic results on the analysis of graphs to decompose generalized plans into smaller components in a novel algorithm for conducting a hierarchical analysis for termination of arbitrary generalized plans. Theoretical analysis of the new framework establishes soundness of the presented algorithms and shows how it goes beyond existing approaches; empirical analysis illustrates the scope of this approach. Our analysis shows that this new approach can effectively identify termination for a significantly larger class of generalized plans than was possible using existing methods.
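The abstract does not say which classic graph results the framework builds on. Purely as an illustration of decomposing a plan's control structure into smaller components, the sketch below computes strongly connected components with Tarjan's algorithm; each component with more than one node corresponds to a loop that a termination analysis would need to examine. The choice of strongly connected components, the adjacency-dictionary encoding, and the node names are all assumptions for this example, not the paper's algorithm.

```python
from typing import Dict, Hashable, List


def strongly_connected_components(
    graph: Dict[Hashable, List[Hashable]]
) -> List[List[Hashable]]:
    """Tarjan's algorithm (recursive). Returns a list of components."""
    index: Dict[Hashable, int] = {}   # discovery order of each node
    low: Dict[Hashable, int] = {}     # lowest index reachable from the node
    stack: List[Hashable] = []
    on_stack = set()
    components: List[List[Hashable]] = []
    counter = [0]

    def visit(v: Hashable) -> None:
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the stack down to v.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in list(graph):
        if v not in index:
            visit(v)
    return components


# Hypothetical control-flow graph of a small generalized plan with one loop.
plan_graph = {
    "start": ["loop_a"],
    "loop_a": ["loop_b"],
    "loop_b": ["loop_a", "end"],
    "end": [],
}
components = strongly_connected_components(plan_graph)
loops = [c for c in components if len(c) > 1]
```

In this toy graph the only multi-node component is the pair `loop_a`, `loop_b`, so an analysis could focus on that loop in isolation before reasoning about the plan as a whole.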